Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

نویسندگان

  • Samy Bengio
  • Oriol Vinyals
  • Navdeep Jaitly
  • Noam Shazeer
چکیده

Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning. The current approach to training them consists in maximizing the likelihood of each token in the sequence given the current (recurrent) state and the previous token. At inference, the unknown previous token is then replaced by a token generated by the model itself. This discrepancy between training and inference can yield errors that can accumulate quickly along the generated sequence. We propose a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead. Experiments on several sequence prediction tasks show that this approach yields significant improvements.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Application of artificial neural networks on drought prediction in Yazd (Central Iran)

In recent decades artificial neural networks (ANNs) have shown great ability in modeling and forecasting non-linear and non-stationary time series and in most of the cases especially in prediction of phenomena have showed very good performance. This paper presents the application of artificial neural networks to predict drought in Yazd meteorological station. In this research, different archite...

متن کامل

Multi-Step-Ahead Prediction of Stock Price Using a New Architecture of Neural Networks

Modelling and forecasting Stock market is a challenging task for economists and engineers since it has a dynamic structure and nonlinear characteristic. This nonlinearity affects the efficiency of the price characteristics. Using an Artificial Neural Network (ANN) is a proper way to model this nonlinearity and it has been used successfully in one-step-ahead and multi-step-ahead prediction of di...

متن کامل

Online Symbolic-Sequence Prediction with Recurrent Neural Networks

This paper studies the use of recurrent neural networks for predicting the next symbol in a sequence. The focus is on online prediction, a task much harder than the classical offline grammatical inference with neural networks. Different kinds of sequence sources are considered: finitestate machines, chaotic sources, and texts in human language. Two algorithms are used for network training: real...

متن کامل

Recurrent Neural Network Structured Output Prediction for Spoken Language Understanding

Recurrent Neural Networks (RNNs) have been widely used for sequence modeling due to their strong capabilities in modeling temporal dependencies. In this work, we study and evaluate the effectiveness of using RNNs for slot filling, a key task in Spoken Language Understanding (SLU), with special focus on modeling label sequence dependencies. Recent work on slot filling using RNNs did not model la...

متن کامل

Accuracy comparison of Elamn and Jordan artificial neural networks for air particular matter concentration (PM 10) prediction using MODIS satellite images, a case study of Ahvaz.

Due to the complexity of air pollution action, artificial intelligence models specifically, neural networks are utilized to simulate air pollution. So far, numerous artificial neural network models have been used to estimate the concentration of atmospheric PMs. These models have had different accuracies that scholars are constantly exceed their efficiency using numerous parameters. The current...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015